Sample Efficient Reinforcement Learning Method via High Efficient Episodic Memory
نویسندگان
چکیده
منابع مشابه
Sample Efficient Reinforcement Learning with Gaussian Processes
This paper derives sample complexity results for using Gaussian Processes (GPs) in both modelbased and model-free reinforcement learning (RL). We show that GPs are KWIK learnable, proving for the first time that a model-based RL approach using GPs, GP-Rmax, is sample efficient (PAC-MDP). However, we then show that previous approaches to model-free RL using GPs take an exponential number of step...
متن کاملEfficient Reinforcement Learning via Initial Pure Exploration
In several realistic situations, an interactive learning agent can practice and refine its strategy before going on to be evaluated. For instance, consider a student preparing for a series of tests. She would typically take a few practice tests to know which areas she needs to improve upon. Based of the scores she obtains in these practice tests, she would formulate a strategy for maximizing he...
متن کامل(More) Efficient Reinforcement Learning via Posterior Sampling
Most provably-efficient reinforcement learning algorithms introduce optimism about poorly-understood states and actions to encourage exploration. We study an alternative approach for efficient exploration: posterior sampling for reinforcement learning (PSRL). This algorithm proceeds in repeated episodes of known duration. At the start of each episode, PSRL updates a prior distribution over Mark...
متن کاملEfficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayesopti...
متن کاملSample-Efficient Reinforcement Learning through Transfer and Architectural Priors
Recent work in deep reinforcement learning has allowed algorithms to learn complex tasks such as Atari 2600 games just from the reward provided by the game, but these algorithms presently require millions of training steps in order to learn, making them approximately five orders of magnitude slower than humans. One reason for this is that humans build robust shared representations that are appl...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: IEEE Access
سال: 2020
ISSN: 2169-3536
DOI: 10.1109/access.2020.3009329